Direct and Unbiased Multiple Imputation Methods for Missing Values of Categorical Variables

Abstract

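Missing data is a common problem in statistical analyses. To make use of the information in incompletely observed data, missing values can be imputed so that standard statistical methods can be used to analyze the data. Variables with missing values are often categorical, and the missing pattern may not be monotone. Currently, commonly used multiple imputation methods for non-monotone missing patterns do not allow direct inclusion of categorical variables. Categorical variables are converted to numerical variables before imputation. For many applications, those imputed numerical values must then be converted back to categorical values. However, this conversion introduces bias, which can seriously affect subsequent analyses. In this paper, we propose two direct imputation methods for categorical variables with a non-monotone missing pattern: a direct imputation approach incorporated with the expectation-maximization algorithm, and a direct imputation approach incorporated with a new algorithm, the imputation-maximization algorithm. Simulation studies show that both direct imputation methods perform better than the method using variable conversion. An application to real data is provided to compare the direct imputation methods with the method using variable conversion.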
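The bias attributed to variable conversion can be illustrated with a toy example. The sketch below is not the paper's method: it assumes one simple form of conversion (treat a 0/1 category as numerical, draw imputations from a normal distribution with matching mean and variance, then round back at 0.5) and compares it with drawing the category directly from its estimated proportion. The function names, the 0.5 cutoff, and the assumption that the proportion `p` is known rather than estimated are all illustrative choices.

```python
import math
import random


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def rounded_normal_share(p):
    """Expected share of 1s when a 0/1 variable with P(1) = p is imputed
    from N(p, p(1-p)) and rounded back to a category at 0.5.
    (Hypothetical conversion scheme, chosen for illustration only.)"""
    sd = math.sqrt(p * (1.0 - p))
    return 1.0 - normal_cdf(0.5, p, sd)


def simulate(p, n_missing, seed=0):
    """Impute n_missing values both ways; return the share of 1s from
    (conversion-and-rounding, direct categorical draw)."""
    rng = random.Random(seed)
    sd = math.sqrt(p * (1.0 - p))
    # Conversion route: numerical draw, then convert back by rounding.
    converted = [1 if rng.gauss(p, sd) >= 0.5 else 0 for _ in range(n_missing)]
    # Direct route: draw the category itself with probability p.
    direct = [1 if rng.random() < p else 0 for _ in range(n_missing)]
    return sum(converted) / n_missing, sum(direct) / n_missing


if __name__ == "__main__":
    p = 0.2
    print("true share of category 1:      ", p)
    print("expected share after rounding: ", round(rounded_normal_share(p), 4))
    print("simulated (converted, direct): ", simulate(p, 100_000))
```

Under these assumptions, the rounded imputations overstate the rarer category (about 0.227 instead of 0.2), while the direct draws stay close to the true proportion. The gap disappears only when `p` is 0.5.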

Related resources

Multiple Imputation of Missing Categorical and Continuous Values via Bayesian Mixture Models with Local Dependence

We present a nonparametric Bayesian joint model for multivariate continuous and categorical variables, with the intention of developing a flexible engine for multiple imputation of missing values. The model fuses Dirichlet process mixtures of multinomial distributions for categorical variables with Dirichlet process mixtures of multivariate normal distributions for continuous variables. We inco...

A nonparametric multiple imputation approach for missing categorical data

BACKGROUND Incomplete categorical variables with more than two categories are common in public health data. However, most of the existing missing-data methods do not use the information from nonresponse (missingness) probabilities. METHODS We propose a nearest-neighbour multiple imputation approach to impute a missing at random categorical outcome and to estimate the proportion of each catego...

Missing Data and Imputation Methods in Partition of Variables

We deal with the effect of missing data under a "Missing at Random" model on the classification of variables with non-hierarchical methods. The partitions are compared using the Rand index.

Bayesian Multiple Imputation and Maximum Likelihood Methods for Missing Data

Bayesian multiple imputation and maximum likelihood provide useful strategies for dealing with datasets that include missing values. Imputation methods affect the significance of test results and the quality of estimates. In this paper, the general procedures of multiple imputation and maximum likelihood are described, which include the normal-based analysis of a multiply imputed dataset. A Monte Carlo si...

Multiple Imputation of Missing or Faulty Values Under Linear Constraints

Many statistical agencies, survey organizations, and research centers collect data that suffer from item nonresponse and erroneous or inconsistent values. These data may be required to satisfy linear constraints, e.g., bounds on individual variables and inequalities for ratios or sums of variables. Often these constraints are designed to identify faulty values, which then are blanked and imputed...

Journal

Journal title: Journal of data science

Year: 2021

ISSN: 1680-743X, 1683-8602

DOI: https://doi.org/10.6339/jds.201207_10(3).0007